Tree and Graph Mining
Abstract
During the past decade, we have witnessed explosive growth in our ability to both generate and collect data. Various data mining techniques have been proposed and widely employed to discover valid, novel, and potentially useful patterns in these data. Data mining involves the discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in huge collections of data. One of the key success stories of data mining research and practice has been the development of efficient algorithms for discovering frequent itemsets, both sequential (Srikant & Agrawal, 1996) and nonsequential (Agrawal & Srikant, 1994). Generally speaking, these algorithms efficiently extract co-occurrences of items, with or without taking the ordering of the items into account. Although sets (or sequences) have effectively modeled many application domains, such as market basket analysis and medical records, many applications have emerged whose data do not fit the traditional concept of a set (or sequence) and instead require richer abstractions, such as graphs or trees. Such graphs or trees arise naturally in a number of application domains, including network intrusion, the Semantic Web, behavioral modeling, VLSI reverse engineering, link analysis, and chemical compound classification. Thus, extracting complex tree-like or graph-like patterns from massive data collections, for instance in bioinformatics, semistructured, or Web databases, has become a necessity. The class of exploratory mining tasks that deal with discovering patterns in massive databases representing complex interactions among entities is called Frequent Structure Mining (FSM) (Zaki, 2002).
In this article, we highlight some strategic application domains where FSM can provide significant results, and we then survey the most important algorithms that have been proposed for mining graph-like and tree-like substructures in massive data collections.
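To make the notion of a "frequent" pattern concrete, the following minimal Python sketch counts support, first for item pairs in a transaction database and then for labeled edges in a small database of graphs. It is an illustration only, not any of the cited algorithms: the function names (`frequent_itemsets`, `frequent_edges`), the toy data, and the simplification of representing a graph as a list of vertex-label pairs are all assumptions made here. Mining larger substructures would additionally require candidate generation and subgraph isomorphism checks, which this sketch omits.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, size=2):
    """Return itemsets of the given size found in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        # Deduplicate and sort so each itemset has one canonical form.
        for combo in combinations(sorted(set(items)), size):
            counts[combo] += 1
    return {c: n for c, n in counts.items() if n >= min_support}


def frequent_edges(graphs, min_support):
    """Return labeled edges found in at least min_support graphs.

    Assumption: each graph is a list of undirected edges, and each edge is
    a pair of vertex labels. An edge counts at most once per graph.
    """
    counts = Counter()
    for edges in graphs:
        # Sort the labels so ("O", "C") and ("C", "O") are the same edge.
        counts.update({tuple(sorted(edge)) for edge in edges})
    return {e: n for e, n in counts.items() if n >= min_support}


# Hypothetical toy data: market baskets and molecule-like graphs.
baskets = [["bread", "milk"], ["bread", "milk", "eggs"], ["milk", "eggs"]]
molecules = [
    [("C", "O"), ("C", "H")],
    [("O", "C"), ("C", "N")],
    [("C", "H"), ("N", "C")],
]

pairs = frequent_itemsets(baskets, min_support=2)
edges = frequent_edges(molecules, min_support=2)
print(pairs)
print(edges)
```

Both functions apply the same support threshold; the difference lies in what is counted. For sets, a pattern is a combination of items, whereas for graphs even the smallest pattern (a single edge) has to be put into a canonical form before occurrences can be compared.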
Publication date: 2009